Search results for "Small data"

Showing 2 of 2 documents

Estimation of causal effects with small data in the presence of trapdoor variables

2021

We consider the problem of estimating causal effects of interventions from observational data when well-known back-door and front-door adjustments are not applicable. We show that when an identifiable causal effect is subject to an implicit functional constraint that is not deducible from conditional independence relations, the estimator of the causal effect can exhibit bias in small samples. This bias is related to variables that we call trapdoor variables. We use simulated data to study different strategies to account for trapdoor variables and suggest how the related trapdoor bias might be minimized. The importance of trapdoor variables in causal effect estimation is illustrated with rea…

FOS: Computer and information sciences; Statistics and Probability; Economics and Econometrics; bias; causality; Computer science; Bayesian probability; Context (language use); 01 natural sciences; Statistics - Computation; Methodology (stat.ME); 010104 statistics & probability; 0504 sociology; Econometrics; 0101 mathematics; Computation (stat.CO); Statistics - Methodology; estimointi; Estimation; Small data; bayesilainen menetelmä; 05 social sciences; 050401 social sciences methods; Estimator; Bayesian estimation; identifiability; Constraint (information theory); functional constraint; Conditional independence; kausaliteetti; Observational study; Statistics Probability and Uncertainty; Social Sciences (miscellaneous)
researchProduct

Strategies to develop radiomics and machine learning models for lung cancer stage and histology prediction using small data samples

2021

Predictive models based on radiomics and machine learning (ML) need large, annotated datasets for training, which are often difficult to collect. We designed an operative pipeline for model training to exploit data already available to the scientific community. The aim of this work was to explore the capability of radiomic features in predicting tumor histology and stage in patients with non-small cell lung cancer (NSCLC). We analyzed the radiotherapy planning thoracic CT scans of a proprietary sample of 47 subjects (L-RT) and integrated this dataset with a publicly available set of 130 patients from the MAASTRO NSCLC collection (Lung1). We implemented intra- and inter-sample cross-valida…

Lung Neoplasms; Computer science; Biophysics; General Physics and Astronomy; Sample (statistics); Cross validation; Machine learning; computer.software_genre; Non-small cell lung cancer; Radiomics; Humans; Lung; Machine Learning; Neoplasm Staging; Carcinoma Non-Small-Cell Lung; Cross-validation; Set (abstract data type); medicine; Radiology Nuclear Medicine and imaging; Stage (cooking); Lung cancer; Non-Small-Cell Lung; Small data; business.industry; Carcinoma; General Medicine; medicine.disease; Random forest; Support vector machine; Artificial intelligence; business; computer
researchProduct